Re: Fwd: Apple Darwin disabled fsync? - Mailing list pgsql-hackers

From Peter Bierman
Subject Re: Fwd: Apple Darwin disabled fsync?
Date
Msg-id a06010200be3eebfde545@[17.202.21.231]
In response to Re: Fwd: Apple Darwin disabled fsync?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Fwd: Apple Darwin disabled fsync?  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
At 12:38 AM -0500 2/20/05, Tom Lane wrote:
>Dominic Giampaolo <dbg@apple.com> writes:
>>>  I believe that what the above comment refers to is the fact that
>>>  fsync() is not sufficient to guarantee that your data is on stable
>>>  storage and on MacOS X we provide a fcntl(), called F_FULLFSYNC,
>>>  to ask the drive to flush all buffered data to stable storage.
>
>I've been looking for documentation on this without a lot of luck
>("man fcntl" on OS X 10.3.8 has certainly never heard of it).
>It's not completely clear whether this subsumes fsync() or whether
>you're supposed to fsync() and then use the fcntl.

My understanding is that you're supposed to fsync() and then use the 
fcntl, but I'm not the filesystems expert. (Dominic, who wrote the 
original message that I forwarded, is.)
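
A minimal sketch of that sequence, assuming the order really is fsync() 
first and the fcntl second (that is only my understanding, as noted 
above). It uses Python's os and fcntl modules; the helper name 
flush_to_stable_storage is made up for illustration, and the fallback 
to plain fsync() on platforms without F_FULLFSYNC is an assumption, not 
something Apple has documented.

```python
import fcntl
import os
import tempfile


def flush_to_stable_storage(fd):
    """fsync() the descriptor, then request a drive flush where available.

    Returns the name of the strongest step that succeeded.
    """
    # Step 1: push the file's dirty data out of the kernel to the drive.
    os.fsync(fd)

    # Step 2: F_FULLFSYNC is only defined on Mac OS X; look it up
    # defensively so the sketch still runs elsewhere.
    full_fsync = getattr(fcntl, "F_FULLFSYNC", None)
    if full_fsync is None:
        return "fsync"

    try:
        # Ask the drive to flush its own buffered data to stable storage.
        fcntl.fcntl(fd, full_fsync)
    except OSError:
        # Assumption: if the request is rejected, settle for the fsync()
        # that already completed above.
        return "fsync"
    return "F_FULLFSYNC"


# Usage: write to a scratch file and report which step was reached.
with tempfile.TemporaryFile() as f:
    f.write(b"commit record")
    f.flush()  # move Python's userspace buffer into the kernel first
    print(flush_to_stable_storage(f.fileno()))
```

The return value only reports which call succeeded; whether the data 
is truly on the platters still depends on the drive honoring the flush.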

I've filed a bug report asking for better documentation about this to 
be placed in the fsync man page. <radar://4012378>


>Also, isn't it fundamentally at the wrong level?  One would suppose that
>the drive flush operation is going to affect everything the drive
>currently has queued, not just the one file.  That makes it difficult
>if not impossible to use efficiently.

I think the intent is to make the fcntl more precise over time, as 
hardware gains the ability to flush at a finer granularity.

One of the advantages Apple has is the ability to set very specific 
requirements for our hardware. So if a block-specific flush command 
becomes part of the ATA spec, Apple can require vendors to support 
it, and support it correctly, before using those drives.

On the other hand, as Dominic described, once the hardware is 
external (like a firewire enclosure), we lose that leverage.


At 12:42 PM -0500 2/20/05, Greg Stark wrote:
>Dominic Giampaolo <dbg@apple.com> writes:
>
>>  > In most cases you do not need such a heavy handed operation and fsync() is
>>  > good enough.
>
>Really? Can you think of a single application for which this definition of
>fsync is useful?
>
>Kernel buffers are transparent to the application, just as the disk buffer is.
>It doesn't matter to an application whether the data is sitting in a kernel
>buffer, or a buffer in the disk, it's equivalent. If fsync doesn't guarantee
>the writes actually end up on non-volatile disk then as far as the application
>is concerned it's just an expensive noop.

I think the intent of fsync() is closer to what you describe, but the 
convention is that fsync() hands responsibility to the disk hardware. 
That's how every other Unix seems to handle fsync() too. This gives 
you good performance, and if you combine a smart fsync()ing 
application with reliable storage hardware (like an Xserve RAID that 
battery-backs its own write caches), you get the best combination.

If you know you have unreliable hardware and critical reliability 
requirements, then you can use the fcntl, which seems to offer more 
control than other OSes give.

-pmb

